NSF PAR Search | NSF Public Access Repository

Preliminary Experiments with Transformer based Approaches To Automatically Inferring Domain Models from Textbooks

Banjade, Rabin; Oli, Priti; Tamang, Lasang Jimba; Rus, Vasile (July 2022, Proceedings of the 15th International Conference on Educational Data Mining)

Domain modeling is a central component in education technologies as it represents the target domain students are supposed to train on and eventually master. Automatically generating domain models can lead to substantial cost and scalability benefits. Automatically extracting key concepts or knowledge components from, for instance, textbooks can enable the development of automatic or semi-automatic processes for creating domain models. In this work, we explore the use of transformer-based pre-trained models for the task of keyphrase extraction. Specifically, we investigate and evaluate four variants of BERT, a pre-trained transformer-based architecture, that differ in training data, training objective, or training strategy, and use them to extract knowledge components from textbooks for the domain of intro-to-programming. We report results obtained using the following BERT-based models: BERT, CodeBERT, SciBERT, and RoBERTa.
DeepCode: An Annotated Set of Instructional Code Examples to Foster Deep Code Comprehension and Learning

https://doi.org/10.1007/978-3-031-09680-8_4

Rus, Vasile; Brusilovsky, Peter; Tamang, Lasang Jimba; Akhuseyinoglu, Kamil; Fleming, Scott (June 2022, Proceedings of 18th International Conference on Intelligent Tutoring Systems)
Crossley, Scott; Popescu, Elvira (Ed.)
We present here a novel instructional resource, called DeepCode, to support deep code comprehension and learning in intro-to-programming courses (CS1 and CS2). DeepCode is a set of instructional code examples, which we call a codeset, annotated by our team with comments (e.g., explaining the logical steps of the underlying problem being solved) and related instructional questions that can serve as hints to help learners think about and articulate explanations of the code. While DeepCode was designed primarily to serve our larger effort of developing an intelligent tutoring system (ITS) that fosters the monitoring, assessment, and development of code comprehension skills for students learning to program, the codeset can be used for other purposes, such as assessment, problem-solving, and other learning activities such as studying worked-out code examples with explanations and code visualizations. We present the underlying principles, theories, and frameworks behind our design process and the annotation guidelines, and we summarize the resulting codeset of 98 annotated Java code examples, which include 7,157 lines of code (including comments), 260 logical steps, 260 logical step details, 408 statement level comments, and 590 scaffolding questions.
Automated Assessment of Quality of Jupyter Notebooks Using Artificial Intelligence and Big Code

https://doi.org/10.32473/flairs.v34i1.128560

Oli, Priti; Banjade, Rabin; Tamang, Lasang Jimba; Rus, Vasile (May 2021, The International FLAIRS Conference Proceedings)

We present in this paper an automated method to assess the quality of Jupyter notebooks. The quality of notebooks is assessed in terms of reproducibility and executability. Specifically, we automatically extract a number of expert-defined features for each notebook, perform a feature selection step, and then train supervised binary classifiers to predict whether a notebook is reproducible and executable, respectively. We also experimented with semantic code embeddings to capture the notebooks' semantics. We evaluated these methods on a dataset of 306,539 notebooks and achieved an F1 score of 0.87 for reproducibility and 0.96 for executability (using expert-defined features) and an F1 score of 0.81 for reproducibility and 0.78 for executability (using code embeddings). Our results suggest that semantic code embeddings can be used to determine with good performance the reproducibility and executability of Jupyter notebooks, and since they can be automatically derived, they have the advantage of requiring no expert involvement to define features.
A Comparative Study of Free Self-Explanations and Socratic Tutoring Explanations for Source Code Comprehension

https://doi.org/10.1145/3408877.3432423

Tamang, Lasang Jimba; Alshaikh, Zeyad; Khayi, Nisrine Ait; Oli, Priti; Rus, Vasile (March 2021, Proceedings of the 52nd ACM Technical Symposium on Computer Science Education)
We present in this paper the results of a randomized controlled trial that compared the effectiveness of two instructional strategies that scaffold learners' code comprehension processes: eliciting Free Self-Explanation and a Socratic Method. Code comprehension, i.e., understanding source code, is a critical skill for both learners and professionals. Improving learners' code comprehension skills should result in improved learning, which in turn should help with retention in intro-to-programming courses, which are notorious for suffering from very high attrition rates due to the complexity of programming topics. To this end, the reported experiment is meant to explore the effectiveness of various strategies to elicit self-explanation as a way to improve comprehension and learning during complex code comprehension and learning activities in intro-to-programming courses. The experiment showed pre-/post-test learning gains of 30% (M = 0.30, SD = 0.47) for the Free Self-Explanation condition and learning gains of 59% (M = 0.59, SD = 0.39) for the Socratic method. Furthermore, we investigated the behavior of the two strategies as a function of students' prior knowledge, which was measured using learners' pretest scores. For the Free Self-Explanation condition, there was no significant difference in mean learning gains for low vs. high knowledge students. The magnitude of the difference in performance (mean difference = 0.02, 95% CI: -0.34 to 0.39) was very small (eta squared = 0.006). Likewise, the Socratic method showed no significant difference in mean learning gains between low vs. high performing students. The magnitude of the performance difference (mean difference = -0.24, 95% CI: -0.534 to 0.03) was large (eta squared = 0.10). These findings suggest that eliciting self-explanations can be used as an effective strategy and that guided self-explanations, as in the Socratic method condition, are more effective at inducing learning gains.
Experiments with a Socratic Intelligent Tutoring System for Source Code Understanding

Alshaikh, Zeyad; Tamang, Lasang Jimba; Rus, Vasile (April 2020, The Thirty-Third International Florida Artificial Intelligence Research Society Conference (FLAIRS-33))

Computer Science (CS) education is critical in today's world, and introductory programming courses are considered extremely difficult and frustrating, often a major stumbling block for students wishing to pursue computer programming related careers. In this paper, we describe the design of Socratic Tutor, an Intelligent Tutoring System that can help novice programmers better understand programming concepts. The system was inspired by the Socratic method of teaching, in which the main goal is to ask a set of guiding questions about key concepts and major steps or segments of complete code examples. To evaluate the Socratic Tutor, we conducted a pilot study with 34 computer science students, and the results are promising in terms of learning gains.